Addressing a billion-entries multi-petabyte distributed file system backup problem with cback: from files to objects
Abstract
CERNBox is the cloud collaboration hub at CERN. The service has more than 37,000 user accounts. The backup of user and project spaces data is critical for the service. The underlying storage system hosts over a billion files, which amount to 12 PB distributed over thousands of disks with a two-replica layout. Performing a backup operation over this vast number of files is a non-trivial task. The original backup system (an in-house event-driven file-level system) has been reconsidered and replaced by a new scalable backup infrastructure based on the open source tool RESTIC. The new system, codenamed cback, provides the features needed in the HEP community to guarantee data safety and smooth operation from the system administrators. Daily snapshot-based backups of all our areas, along with automatic verification and restores, are possible with this new development. The backup data is also de-duplicated in blocks and stored as objects in a disk-based S3 cluster in another geographical location on the CERN campus, reducing storage costs and protecting the data from major catastrophic events. We report on the design and operational experience of running the system, and on future improvement possibilities.
Journal
Journal title: EPJ Web of Conferences
Year: 2021
ISSN: 2101-6275, 2100-014X
DOI: https://doi.org/10.1051/epjconf/202125102071